feat(cluster_healthcheck): add cluster health validation role#39
stevefulme1 wants to merge 2 commits into
sabre1041 left a comment
Review the issues that are being reported.
Also, please review the conflicted files.
    kind: Pod
    namespace: "{{ cluster_healthcheck_kubevirt_namespace }}"
    label_selectors:
      - "app=cdi-operator"
This label does not match what is deployed
    kind: Pod
    namespace: "{{ cluster_healthcheck_kubevirt_namespace }}"
    label_selectors:
      - "app=cdi-deployment"
This label does not match what is deployed
    - name: mtv_health | Evaluate Provider readiness
      ansible.builtin.set_fact:
        __cluster_healthcheck_providers_not_ready: >-
This is not reporting correctly. Both providers are Ready in my testing environment
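One way to base this check on the Ready condition is sketched below. It is only an illustration under stated assumptions: that Provider CRs are served as forklift.konveyor.io/v1beta1 and carry a condition with type Ready and status "True". The registered variable __cluster_healthcheck_providers is a hypothetical name, not taken from this PR.

```yaml
# Sketch only: the API version and condition layout are assumptions to verify
# against the cluster; __cluster_healthcheck_providers is a hypothetical name.
- name: mtv_health | Get Providers (sketch)
  kubernetes.core.k8s_info:
    api_version: forklift.konveyor.io/v1beta1
    kind: Provider
    namespace: "{{ cluster_healthcheck_mtv_namespace }}"
  register: __cluster_healthcheck_providers

# Start from an empty list so the fact is defined even when every Provider is Ready.
- name: mtv_health | Initialize not-ready Provider list (sketch)
  ansible.builtin.set_fact:
    __cluster_healthcheck_providers_not_ready: []

# Append the name of each Provider that has no condition with type=Ready and status="True".
- name: mtv_health | Evaluate Provider readiness (sketch)
  ansible.builtin.set_fact:
    __cluster_healthcheck_providers_not_ready: >-
      {{ __cluster_healthcheck_providers_not_ready + [item.metadata.name] }}
  loop: "{{ __cluster_healthcheck_providers.resources }}"
  loop_control:
    label: "{{ item.metadata.name }}"
  when: >-
    (item.status | default({})).conditions | default([])
    | selectattr('type', 'equalto', 'Ready')
    | selectattr('status', 'equalto', 'True')
    | list | length == 0
```

Kubernetes conditions report status as the strings "True", "False", or "Unknown", so the comparison is against the string 'True' rather than a boolean.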
          | selectattr('status.phase', 'equalto', 'Running')
          | list | length) }}

    - name: network_health | Check migration network configuration
This should only be checked if one has been defined in the HyperConverged CR
      kubernetes.core.k8s_info:
        api_version: k8s.cni.cncf.io/v1
        kind: NetworkAttachmentDefinition
        namespace: "{{ cluster_healthcheck_mtv_namespace }}"
This should check in the openshift-cnv namespace
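The two network comments could be handled together roughly as follows. This is a sketch, not the change made in the PR. It assumes the migration network is set in the HyperConverged CR at spec.liveMigrationConfig.network, that the value is the name of a NetworkAttachmentDefinition, and that cluster_healthcheck_kubevirt_namespace resolves to openshift-cnv. The registered and fact variable names are hypothetical.

```yaml
# Sketch only: field path, API versions, and variable names are assumptions.
- name: network_health | Get HyperConverged CR (sketch)
  kubernetes.core.k8s_info:
    api_version: hco.kubevirt.io/v1beta1
    kind: HyperConverged
    namespace: "{{ cluster_healthcheck_kubevirt_namespace }}"
  register: __cluster_healthcheck_hco

# Resolves to an empty string when no CR exists or no migration network is set.
- name: network_health | Extract configured migration network (sketch)
  ansible.builtin.set_fact:
    __cluster_healthcheck_migration_network: >-
      {{ __cluster_healthcheck_hco.resources
         | map(attribute='spec.liveMigrationConfig.network', default='')
         | first | default('') }}

# Look up the NAD only when a migration network is configured, and look in the
# KubeVirt namespace (assumed to be openshift-cnv), not the MTV namespace.
- name: network_health | Check migration network NAD (sketch)
  kubernetes.core.k8s_info:
    api_version: k8s.cni.cncf.io/v1
    kind: NetworkAttachmentDefinition
    namespace: "{{ cluster_healthcheck_kubevirt_namespace }}"
    name: "{{ __cluster_healthcheck_migration_network }}"
  register: __cluster_healthcheck_migration_nad
  when: __cluster_healthcheck_migration_network | length > 0
```

When the `when` condition is false the task is skipped, so any later task that reads __cluster_healthcheck_migration_nad.resources would need the same guard.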
Adds a cluster_healthcheck role that validates OpenShift cluster health for virtualization migration readiness across six categories: OCP nodes, KubeVirt, MTV, storage, network, and post-migration VMs. Generates an HTML summary report with pass/fail/warning status.

Review feedback addressed:
- Fix CDI pod labels to use app.kubernetes.io/component selectors
- Fix Provider readiness to correctly detect Ready condition status
- Make migration network check conditional on HyperConverged CR config
- Check migration NAD in openshift-cnv namespace, not openshift-mtv
- Drop unrelated scaffolding file changes (CODE_OF_CONDUCT, etc.)
d4928cd to 51d077e
@stevefulme1 Looking better. I'm still seeing misalignment on the CDI components. Here are the labels that are applied to the CDI pods:
…sing components
- Change CDI label selectors from app.kubernetes.io/component to cdi.kubevirt.io, which matches actual pod labels on OCP 4.21+
- Add cdi-apiserver and cdi-uploadproxy pod health checks (were missing)
- Add CDI API Server and CDI Upload Proxy to the health report details
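A looped lookup over the four CDI components might look like the sketch below. The label key cdi.kubevirt.io comes from the commit message above. The label values are an assumption (each is taken to equal the component name) and should be confirmed with `oc get pods -n openshift-cnv --show-labels`. The variable names __cluster_healthcheck_cdi_pods and __cluster_healthcheck_cdi_unhealthy are hypothetical.

```yaml
# Sketch only: label values and registered variable names are assumptions.
- name: kubevirt_health | Get CDI component pods (sketch)
  kubernetes.core.k8s_info:
    api_version: v1
    kind: Pod
    namespace: "{{ cluster_healthcheck_kubevirt_namespace }}"
    label_selectors:
      - "cdi.kubevirt.io={{ item }}"
  register: __cluster_healthcheck_cdi_pods
  loop:
    - cdi-operator
    - cdi-deployment
    - cdi-apiserver
    - cdi-uploadproxy

# Start from an empty list so the fact is defined when all components are healthy.
- name: kubevirt_health | Initialize unhealthy CDI component list (sketch)
  ansible.builtin.set_fact:
    __cluster_healthcheck_cdi_unhealthy: []

# A component is flagged when its selector matched no pod in phase Running.
- name: kubevirt_health | Flag CDI components without a Running pod (sketch)
  ansible.builtin.set_fact:
    __cluster_healthcheck_cdi_unhealthy: >-
      {{ __cluster_healthcheck_cdi_unhealthy + [item.item] }}
  loop: "{{ __cluster_healthcheck_cdi_pods.results }}"
  loop_control:
    label: "{{ item.item }}"
  when: >-
    item.resources
    | selectattr('status.phase', 'equalto', 'Running')
    | list | length == 0
```

A selector that matches nothing also yields an empty resources list, so a wrong label value shows up as an unhealthy component rather than being silently skipped.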
Fixed the CDI label selectors and added the missing components:
All four CDI pods are now covered:
Summary
Adds a new cluster_healthcheck role that validates the health of an OpenShift cluster for virtualization migration readiness. The role performs comprehensive checks across six categories and generates an HTML summary report with pass/fail/warning status and actionable recommendations.
Health checks included
Files added
Design decisions
- validate_migration role patterns (task naming, k8s_info usage, variable prefixing)
- cluster_healthcheck_ per collection convention
- __cluster_healthcheck_ double-underscore prefix
- kubernetes.core.k8s_info, ansible.builtin.*)
- cluster_healthcheck_checks default
- cluster_healthcheck_post_migration_vms
Testing
ansible-lint --profile production passes with 0 errors on the role (playbook FQCN resolution matches existing collection behavior)